Dynamic presence filter

ABSTRACT

A speech processing device operative to correct deficiencies in speech programs caused by inadequate energy in the "presence band". The device determines the relative amount of total signal energy in the presence band, and, if it is inadequate, automatically boosts the amplitude of presence band components to a level to obtain a more optimum spectral distribution. The circuit is designed to operate with an automatic speech-music discriminator which inhibits control action during music programming.

United States Patent

Allen et al.

DYNAMIC PRESENCE FILTER

Inventors: Richard G. Allen, Pound Ridge, N.Y.; Emil Torick, Darien, Conn.

[73] Assignee: Columbia Broadcasting System, Inc., New York, N.Y.

Filed: Mar. 15, 1972

Appl. No.: 234,913

Related U.S. Application Data Division of Ser. No. 47,214, June 18, 1970, Pat. No. 3,668,322.

U.S. Cl.: 330/59, 330/29, 330/144, 333/18 T

Int. Cl.: H03f 17/00

Field of Search: 330/29, 59, 144, 145; 333/18 T, 28 T; 179/1 D

[45] May 22, 1973

[56] References Cited

UNITED STATES PATENTS

3,281,706 10/1966 Morris et al. 330/59

Primary Examiner: Roy Lake

Assistant Examiner: James B. Mullins

Attorney: Spencer E. Olson

3 Claims, 7 Drawing Figures

Patented May 22, 1973, 5 Sheets-Sheet 1

[Drawing residue: block diagram labeled CONTROL SIGNAL; graphs with horizontal axis FREQUENCY IN HERTZ]

This is a division of application Ser. No. 47,214, filed June 18, 1970, now U.S. Pat. No. 3,667,668.

BACKGROUND OF THE INVENTION

This invention relates to signal processing apparatus, and more particularly to an automatic signal processing device for use with broadcasting equipment, for example, for improving the quality and intelligibility of poor audio signals, particularly speech signals.

It is known that in the speech process, most of the energy is required by the vowels, which generally produce low-frequency audio signal components from about 100 to 1,000 Hz; however, most of the information or intelligibility of speech is conveyed by the consonants, which generally produce the higher-frequency audio signal components in the range of 2,000 to 4,000 Hz. Because the higher-frequency components of speech are more rapidly attenuated with distance than the low-frequency components, the character of speech heard at close range differs from that heard at a distance; the attenuation of the higher-frequency components diminishes the intelligibility of the speech. Speech heard at close range, where the high-frequency energy is not appreciably attenuated, is said to have "presence"; hence, the 2,000 to 4,000 Hz band is called the "presence band."

The sound-energy amplitudes for consonants are naturally to db below those for vowels. In close proximity, the human ear, being far more sensitive in the consonant region, compensates for this, but when the high frequency levels are further reduced, as by excessive distance or by electronic equipment having poor frequency response, the ear is less able to decipher the components of the speech and understanding or intelligibility is impaired. Modern broadcasting stations derive their programs from a wide variety of sources, both from within studios and remote from them, and despite conscientious maintenance, there are many opportunities for upsetting the spectral balance between the high and low frequency components, which causes degradation in speech quality. One of the most common causes of high frequency loss is azimuth alignment errors in tape recorders, particularly in cartridge playback apparatus. Different brands and tape thicknesses require different bias settings, seldom provided in cartridge machines. Other factors contributing to degradation of speech quality are excessive recording levels, incorrect line equalization, poor quality remote equipment, and excessive talker-to-microphone distances. As a result, the intelligibility of speech programs is frequently poor due to the attenuation of vital consonant energy in the 2,000 to 4,000 Hz presence band.

SUMMARY OF THE INVENTION

It is a principal object of the present invention to provide apparatus for automatically correcting the quality of poor audio signals.

Another object of the invention is to provide a control unit for use in conjunction with broadcasting equipment for automatically correcting the quality of audio signals prior to broadcast.

Another object of the invention is to provide a control unit which improves poor quality signals in speech programming but which may be advantageously used for all types of audio program material and which may be automatically inhibited, without operator attention, during musical programming.

Briefly, these and other objects which will become apparent as the description proceeds, are achieved by continuously analyzing the program material to determine the proportion of the total energy in the 2,000 to 4,000 Hz presence band, and if it is determined to be insufficient, boosting the amplitude of the presence band signal components to a level to obtain optimum spectral distribution for maximum intelligibility. The system includes a speech-music discriminator which automatically inhibits the equalizing action during musical passages, and the circuit is so designed that when no correction of the audio signal is necessary, there is no equalizing action and the system functions as a flat amplifier.

DESCRIPTION OF THE DRAWINGS

Other objects, features and advantages of the invention, and a better understanding of its construction and operation, will become evident from the following detailed description, taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a graph illustrating the long-term acoustic spectrum for male voices;

FIG. 2 is a graph of a 70 phon equal loudness contour;

FIG. 3 is a block diagram of the dynamic presence equalizer according to the invention, shown in a typical installation;

FIG. 3A is a block diagram of an alternative embodiment of a portion of the circuit of FIG. 3;

FIG. 4 is a circuit diagram of a portion of the system illustrated in FIG. 3;

FIG. 5 is a series of curves illustrating the operating characteristics of the system of FIG. 3; and

FIG. 6 is a block diagram of the speech-music discriminator portion of the system of FIG. 3.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention is based upon the recognition that in speech most of the energy is required by the vowels, which usually produce low-frequency audio signal components from about 100 to 1,000 Hz, but that most of the information is conveyed by the consonants, which generally produce audio signal components in the range of 2,000 to 4,000 Hz. For example, the work of Sacia and Beck described in an article entitled "The Power of Fundamental Speech Sounds," Bell System Technical Journal, 1926, 5, 393-403, indicated an average power of 18 microwatts for vowels but only 0.6 microwatt for consonants. This is also illustrated in FIG. 1, which shows the long-term acoustic spectrum for male voices measured 18 inches in front of the lips. It will be noted that the signal level at 4 kHz, where a significant quantity of the consonant information is contained, is 20 db lower than in the range from 100 to approximately 400 Hz. Under normal conversation conditions, this disparity in levels is conveniently compensated by the equal loudness hearing contour.

Loudness is defined in the Acoustical Terminology publication of the American Standards Institute as "the intensity attribute of an auditory sensation, in terms of which sounds may be ordered on a scale extending from soft to loud." Fundamental to attempts to measure loudness has been the derivation of equal loudness contours which graphically depict the measurement of levels of sound of equal loudness as a function of frequency and intensity, and are obtained by loudness balance processes. A set of equal loudness contours is illustrated and described in an article by Bauer and Torick entitled "Researches in Loudness Measurement," IEEE TRANSACTIONS ON AUDIO AND ELECTROACOUSTICS, Vol. AU-14, No. 3, pp. 141-151, 1966. These contours were obtained by subjecting test teams to octave bands of pink noise, pink noise being characterized as having equal energy distribution per octave band. The 70 phon equal loudness contour developed in this study is shown in FIG. 2 in the inverted, or sensitivity, form. It will be noted that there is a pronounced peak in ear sensitivity at 4,000 Hz. If the long-term acoustic spectrum of male voices (FIG. 1) is combined with the equal-loudness contour shown in FIG. 2, it will be seen that the increased high-frequency hearing sensitivity compensates very well for the decreasing speech energy at high frequencies.
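As a rough check on the Sacia and Beck figures quoted above, the consonant power can be expressed in decibels relative to the vowel power. The following Python sketch is illustrative only; the function name is hypothetical and is not part of the original disclosure.

```python
import math

def power_ratio_db(p_ref_watts, p_watts):
    """Level of p_watts relative to p_ref_watts, in decibels."""
    return 10.0 * math.log10(p_watts / p_ref_watts)

vowel_power = 18e-6       # 18 microwatts, as quoted from the cited article
consonant_power = 0.6e-6  # 0.6 microwatt, as quoted from the cited article

# Consonants sit roughly 15 decibels below vowels on these figures.
print(round(power_ratio_db(vowel_power, consonant_power), 1))
```

On these two numbers alone the consonants are about 15 decibels below the vowels, which is of the same order as the 20 db difference read from FIG. 1 at 4 kHz.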

It is seen, therefore, that although the consonants, which supply most of the information of speech, are transmitted at a much lower level than the vowels, the human capacity for comprehension of the message is not impaired, because the ear is far more sensitive in the consonant region. Some of these higher frequency components are absorbed or dissipated, however, when the talker is distant from the listener and intelligibility is diminished. The voice quality, experienced when a listener is in close proximity to the talker, is known as presence, and is characterized by having an abundance of speech energy in the 2,000 to 4,000 Hz region. This region is defined as the presence band.

It is the function of the present invention to continuously analyze audio speech signals in situations where the talker and listener are connected through complex interfaces of a typical broadcast system which tend to attenuate the presence band components, to determine the proportion of the total energy in the "presence band," and to boost the amplitude of presence band signal components to a level to obtain a more optimum spectral distribution.

The dynamic presence equalizer of the invention is shown in block diagram form in FIG. 3, in a typical installation. An incoming signal from a source such as a tape playback machine, a local microphone, or a remote pickup, collectively represented by a console 10, is amplified by a suitable amplifier 12 designed to accurately control the level of the audio signal. A suitable circuit for this purpose is described in Kaiser et al. U.S. Pat. No. 3,260,957 entitled "Compensated Platform Gain Control Apparatus." The level controlled signal is applied to a circuit 14, termed a dynamic presence filter (the details of which will be described later), the output of which may be applied through suitable output circuitry directly, or transmitted over a telephone line from the studio, to a transmitter 16. Prior to application to the transmitter, the signals are preferably peak-limited by a peak limiter 18, which may be of the form described in Torick et al. U.S. Pat. No. 3,398,381 entitled "Control Circuit for Restricting Instantaneous Peak Levels of Audio Signals."

The dynamic presence filter 14, the circuit details of which are shown in FIG. 4, is designed to have the response characteristic illustrated in FIG. 5, so as to selectively boost the presence band components of the frequency spectrum in accordance with the 70 phon equal loudness contour of FIG. 2. The incoming signal from automatic gain control circuit 12 is applied via terminals 20 and 22 and a coupling transformer 24 to an emitter follower circuit including transistors 26 and 28. The signal appearing across the secondary winding of the transformer is applied in push-pull relationship to the two transistors by equal-valued resistors 23 and 25 connected between their base electrodes, with the junction therebetween connected to ground potential. The collector electrodes of the transistors are connected together and to a source of positive energizing potential, represented by terminal 30, and the emitter electrodes of transistors 26 and 28 are respectively connected through equal-valued resistors 32 and 34 to a negative source of energizing potential, represented by terminal 36.

The desired response characteristic illustrated in FIG. 5 is obtained by utilizing a two-loop filter network to which the signals appearing at the emitter electrodes of transistors 26 and 28 are applied in parallel. The outer loop comprises a pair of equal resistors 38 and 40 connected between respective emitter electrodes of transistors 26 and 28 and inverting and non-inverting input terminals, respectively, of an operational amplifier 42. The operational amplifier, which may be Type MC1433G, available from Motorola, is energized from potential sources 30 and 36, respectively, and is provided with a feedback loop comprising the parallel combination of resistor 48 and capacitor 50 connected from its output terminal 44 to the inverting input terminal. Inasmuch as the outer loop contains balanced, purely resistive components, it is not frequency dependent and has a fixed gain with respect to the input, providing the flat response illustrated in FIG. 5. The function of capacitor 50 is to give a gentle roll-off to frequency components out of the audio range to insure stable operation of the device.

The inner loop of the filter network, the characteristics of which are dependent on the magnitude of an applied control current, comprises a balanced network including equal-valued resistors 52 and 54, connected to the emitter electrodes of transistors 26 and 28, respectively, and in series with another pair of equal-valued resistors 56 and 58 to respective input terminals of operational amplifier 42. Control of the characteristics of the filter is provided by a balanced light dependent resistor (LDR) lamp assembly 60 connected across the two paths, with one of the light dependent resistors 62 connected in shunt with resistor 52, and the other resistor 64 connected in shunt with resistor 54. The resistance of resistors 62 and 64, which have a value of the order of 10 ohms when unilluminated, is dependent upon the illumination from an incandescent lamp 66, one terminal of which is connected to ground potential and to the other terminal of which is applied a control signal having a magnitude inversely proportional to the relative amount of energy instantaneously appearing in the presence band. The manner in which this control signal is obtained will be described later. Suffice it to say at this juncture that with maximum rated current through the lamp 66, the resistance of resistors 62 and 64 is reduced to approximately 1,000 ohms. Thus, when lamp 66 is illuminated, the series resistance of the branches of the inner loop (which is determined primarily by resistors 62 and 64) decreases with the result that a significant amount of signal flows through the inner loop to the input terminals of the operational amplifier 42. The larger the control current applied to the lamp 66, the greater the signal flowing through the inner loop.

The signals flowing through the inner loop are modified by a parallel resonant circuit connected between the two paths comprising resistor 68, capacitor 70 and inductor 72 which have such relative values as to be resonant at approximately 3.5 kHz. Typically, resistor 68 may have a value of 1.2K ohms, capacitor 70 a value of 0.039 µF, and inductor 72, 50 mH. As the control current applied to lamp 66 increases, more of the signal is modified by the parallel resonant circuit, thereby to boost the response of the filter in the presence band region in the manner illustrated in FIG. 5. The circuit components are so selected that with maximum control current the response is boosted approximately db at the resonant frequency. At maximum boost the frequency response above 1,000 Hz will correspond to the earlier-defined 70 phon equal loudness contour.
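The stated resonance can be checked against the typical component values with the standard formula for an ideal LC circuit, f = 1/(2π√(LC)). The Python sketch below takes capacitor 70 as 0.039 microfarad and inductor 72 as 50 millihenries; the function name is hypothetical, and the damping contributed by resistor 68 is ignored, so the result is only an approximation.

```python
import math

def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC circuit, in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

inductor_72 = 50e-3       # 50 mH, typical value given in the text
capacitor_70 = 0.039e-6   # 0.039 microfarad, typical value given in the text

f0 = resonant_frequency_hz(inductor_72, capacitor_70)
print(round(f0))  # close to 3,600 Hz
```

The computed value of roughly 3.6 kHz agrees with the approximately 3.5 kHz resonance stated above and lies within the 2,000 to 4,000 Hz presence band.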

Since it is desirable to introduce boost of the presence band frequency components on a dynamic basis, the just-described balanced configuration is necessary to prevent transmission of the control signal applied to lamp 66 to the output of amplifier 42. By using the balanced LDR, the control signal appears equally on the two inputs of amplifier 42 and consequently is cancelled, thereby precluding generation of "thump" which would occur in a single-sided circuit.
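The cancellation can be illustrated with an idealized differential amplifier, whose output depends only on the difference between its two inputs. The Python sketch below is a simplified model, not the circuit of FIG. 4; the function name and the numeric values are hypothetical.

```python
def differential_output(v_noninverting, v_inverting, gain=1.0):
    """Ideal differential amplifier: responds only to the input difference."""
    return gain * (v_noninverting - v_inverting)

signal = 0.2   # program signal, applied in push-pull (hypothetical value)
thump = 1.5    # disturbance from the control signal (hypothetical value)

# Balanced case: the disturbance appears equally on both inputs and cancels.
balanced = differential_output(+signal / 2 + thump, -signal / 2 + thump)

# Single-sided case: the disturbance reaches only one input and passes through.
single_sided = differential_output(signal + thump, 0.0)

print(balanced, single_sided)
```

In the balanced case only the program signal remains at the output; in the single-sided case the disturbance is added to it.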

Returning now to FIG. 3, the control signal for dynamic presence filter 14 is obtained by a control loop connected between the output terminal 44 of the dynamic presence filter and the lamp 66 (FIG. 4). The signal appearing at output terminal 44 is amplified by a suitable amplifier 80, the output of which is applied in parallel to two circuit paths, the first of which includes a rectifier 82 which delivers a positive-going direct current signal representative of the instantaneous signal amplitude of all frequencies in the audio band of interest. In the other path, the signal is fed through a band pass filter 84, which is tuned to the same center frequency as the dynamic presence filter 14 so as to pass only frequency components in the "presence band," the output of which is amplified by a suitable amplifier 86 and then rectified by a rectifier 88, to provide a positive-going direct current signal representative of the instantaneous amplitude of the presence band frequency components in the signal. Variation of the gain of amplifier 86, by suitable control means diagrammatically illustrated as a potentiometer 90, to vary the amplitude of the direct current signal delivered by rectifier 88, provides a convenient method of adjusting the control level of the overall system.

The two positive-going direct current signals from rectifiers 82 and 88 are applied to a differential amplifier 92 which is operative to subtract the presence-component-representing direct current signal from the full-band-representing direct current signal. The resulting error signal is thus positive if the full band signal is low in presence frequency components (i.e., the broad band signal is greater than the presence signal), is zero if the direct current voltages from the two rectifiers are equal, and is negative if the amplified presence signal is greater than the broad band signal.
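The three cases of the error signal can be sketched as a simple subtraction. In the Python sketch below, the function name and the `presence_gain` parameter (standing in for the adjustable gain of amplifier 86) are hypothetical, and the voltages are arbitrary illustrative values.

```python
def error_signal(full_band_dc, presence_dc, presence_gain=1.0):
    """Full-band level minus the amplified presence-band level."""
    return full_band_dc - presence_gain * presence_dc

# (full-band volts, presence-band volts): arbitrary illustrative values
cases = [(2.0, 0.5), (2.0, 2.0), (2.0, 3.0)]
for full_band, presence in cases:
    err = error_signal(full_band, presence)
    if err > 0:
        verdict = "presence deficient: boost called for"
    elif err == 0:
        verdict = "balanced: no correction"
    else:
        verdict = "presence ample: no boost"
    print(full_band, presence, err, verdict)
```

Raising `presence_gain` makes the error signal smaller for the same program material, which is one way of picturing how the potentiometer sets the control level of the system.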

Because of the dynamic nature of the control loop, it is necessary to determine the most appropriate attack and release times for application of the control signal for best results. This is always a subjective determination, and a certain amount of trial and error is required to obtain most pleasing results, from the listener's standpoint, in the ultimately transmitted signal. In general, if the attack time is too fast the output of the dynamic presence filter tends to be "choppy," and if it is too slow, the system will be unable to follow rapid changes in energy in the presence band. It has been found that an attack time of 100 milliseconds, and a release time of 2 or 3 seconds (this period being less critical than the attack time) provide satisfactory dynamic characteristics for the overall system. Accordingly, the output of differential amplifier 92 is shaped in a time constant circuit 94 designed to give appropriate attack and release time and is then applied through an emitter follower 96 to the ungrounded terminal of the filament of lamp 66 in dynamic presence filter 14 (FIG. 3). An inhibit gate 98 is connected in circuit between the time constant circuit 94 and emitter follower 96 for inhibiting the action of the control loop when music programming, rather than speech, is being broadcast.
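One common way to model separate attack and release behavior is a one-pole smoother that switches between two exponential time constants. The Python sketch below is such a model under stated assumptions: it treats the 100 millisecond and 2 second figures as exponential time constants, which the text does not specify, and it is not a description of time constant circuit 94. All names are hypothetical.

```python
import math

def smooth_control(samples, sample_rate_hz, attack_s=0.1, release_s=2.0):
    """One-pole smoother: fast when the input rises, slow when it falls."""
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate_hz))
    release_coeff = math.exp(-1.0 / (release_s * sample_rate_hz))
    level = 0.0
    out = []
    for x in samples:
        # Rising input uses the attack constant, falling input the release.
        coeff = attack_coeff if x > level else release_coeff
        level = coeff * level + (1.0 - coeff) * x
        out.append(level)
    return out

rate = 1000  # samples per second (assumed)
step = [1.0] * rate + [0.0] * (3 * rate)  # 1 s of demand, then 3 s of none
shaped = smooth_control(step, rate)

print(round(shaped[99], 3))    # level after 100 ms of attack
print(round(shaped[2999], 3))  # level after 2 s of release
```

With these assumptions the control level reaches about 63 percent of its final value after 100 milliseconds and has decayed to about 37 percent two seconds after the demand is removed.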

A control signal for automatically inhibiting gate 98 when music signals are present is derived from a speech-music discriminator 100, to which the output of amplifier 80 is also applied, which is operative to deliver a binary output of predetermined level, +10 volts, for example, in a typical system, when speech is present, so as to allow the control signal to be applied to emitter follower 96, and to apply a zero volt level signal to the gate when music is present, so as to prevent transmission of the control signal. It will be recognized that while one can unequivocally call a crisp news delivery "speech," and a symphony "music," there exists a large variety of in-between types of programming, such as singing commercials, opera, patter-songs, etc., which tend to make classification somewhat difficult. The speech-music discriminator 100 recognizes speech by its staccato nature and makes its decision thereon; it is so designed, however, that any musical accompaniment or background in programming which also includes speech will result in a "music" decision so as to inhibit application of the control signal.

The speech-music discriminator 100 of FIG. 3 is shown in greater detail, in block diagram form, in FIG. 6. Although circuits for automatically discriminating between speech and music forms are known, as for example by Jones U.S. Pat. No. 2,761,897, the system of FIG. 6 is preferred for reasons which will become apparent. As was noted earlier, this circuit relies for its operation on the fact that speech is both staccato in nature and during the period of a word has a wide dynamic range. The discriminator includes a logarithmic amplifier 102 to which the output signal from amplifier 80 is applied, the output of which is rectified by a rectifier 104. The combination of the logarithmic amplifier and rectifier was selected to exploit the large difference in short-term dynamic range between speech and music signals. By employing logarithmic amplification, the peak-to-valley ratio of the envelope of speech is exaggerated compared to that for music. The envelope of the signal delivered by rectifier 104 is amplified in an inverting DC amplifier 106 having circuit values and energizing potentials to cause the speech envelope to encompass a range of approximately 10 volts, varying from almost zero volts at speech troughs to close to +10 volts at the peaks. With this circuit configuration, the envelope of a music-representing signal, on the other hand, will tend to peak at +10 volts but the troughs only rarely drop below about volts. A voltage comparator, such as a Schmitt trigger circuit 108, set to trigger close to zero volts, is thus only actuated by speech signals, with the result that the output of Schmitt trigger 108 is a series of pulses at the syllabic rate.

The pulse stream from trigger circuit 108 is integrated in a peak detector 110, the output of which is a direct current signal whose amplitude is proportional to the "speechiness" of the program. The amplitude of this signal will vary with the rate of speech of the talker and with the precision of his delivery, but seldom drops below +5 volts. Musical programming, on the other hand, only rarely results in a voltage greater than about +1 volt. Consequently, by applying the output of integrator 110 to another comparator, which may be a second Schmitt trigger circuit 112, set at a trigger level of approximately +4 volts, a binary function is obtained; for example, the circuit parameters of circuit 112 may be selected to deliver a +10 volt output for speech signals, and a zero volt level for music. As was noted earlier, when a zero volt level is applied to inhibit gate 98 (FIG. 3) the control signal is not applied to the control lamp in the dynamic presence filter 14. Conversely, application to gate 98 of the +10 volt level permits the control signal to be transmitted to emitter follower 96 and thence to filter 14. Additionally, the maintenance of a +10 volt potential at the inhibit gate prohibits the control voltage from exceeding that level, effectively limiting maximum presence boost.
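The final decision stage and the gating action can be sketched as a threshold comparison followed by a clamp. The Python sketch below is a simplified model: it ignores the hysteresis of a real Schmitt trigger, and the clamp is only one plausible reading of how the gate limits the control voltage. The function and constant names are hypothetical; the voltage levels are those given in the text.

```python
SPEECH_LEVEL_V = 10.0   # gate level delivered for speech, per the text
MUSIC_LEVEL_V = 0.0     # gate level delivered for music, per the text
TRIGGER_LEVEL_V = 4.0   # approximate trigger level of the second comparator

def discriminator_output(detector_v):
    """Binary decision from the peak detector voltage (no hysteresis)."""
    return SPEECH_LEVEL_V if detector_v > TRIGGER_LEVEL_V else MUSIC_LEVEL_V

def inhibit_gate(control_v, gate_v):
    """Pass the control voltage, but never above the gate level or below zero."""
    return max(0.0, min(control_v, gate_v))

# Speech typically yields about +5 V or more at the detector; music about +1 V.
for detector_v, control_v in [(5.0, 3.0), (5.0, 12.0), (1.0, 3.0)]:
    gate_v = discriminator_output(detector_v)
    print(detector_v, control_v, inhibit_gate(control_v, gate_v))
```

With a +10 volt gate level the control voltage passes unchanged up to 10 volts and is limited above that, while a zero volt gate level blocks it entirely.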

As an alternative to differential amplifier 92 for generating the control signal, an analog divider circuit, such as the GPS Corporation Model D5010, may be used. This divider circuit (which is also available from other vendors) derives the quotient of two variables X and Y such that the output equals 10X/Y.

Referring to FIG. 3A, which is a block diagram of only that portion of the dynamic presence equalizer differing from FIG. 3, the rectifiers 82' and 88' are arranged so as to provide negative output voltages for application to the analog divider 114. The full-band-representing direct current signal from rectifier 82' is applied to the X terminal of divider 114; the presence-component-representing direct current signal from rectifier 88' is applied to the Y terminal. The resulting output of the divider is thus positive and large if the full-band-representing signal is low in presence frequency components (i.e., the broad band signal is greater than the presence signal), and is small if the input signal contains an adequate amount of "presence." The resulting control signal, which is applied to the gate 98 after wave shaping, is seen to be mathematically the true inverse proportion of presence band information in the total signal and is thus independent of overall signal level. This results in a control signal independent of level changes, and the resultant action of the circuit is only responsive to changes in "presence."
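The level independence of the quotient can be seen by scaling both rectifier outputs by the same factor and comparing the result with the difference used in the embodiment of FIG. 3. The Python sketch below is illustrative only; the function names and voltages are hypothetical, and an ideal divider with no output limit is assumed.

```python
def divider_control(x_full_band, y_presence):
    """Ideal analog divider output, 10X/Y."""
    return 10.0 * x_full_band / y_presence

def difference_control(full_band, presence):
    """Difference formed by a differential amplifier, for comparison."""
    return full_band - presence

# Negative rectifier outputs, as in FIG. 3A (arbitrary illustrative values).
x, y = -2.0, -1.0
for scale in (1.0, 0.5, 0.25):   # the same program at three overall levels
    quotient = divider_control(scale * x, scale * y)
    difference = difference_control(scale * abs(x), scale * abs(y))
    print(scale, quotient, difference)
```

The quotient stays the same at every level because the scale factor cancels, and it is positive because both inputs are negative; the difference shrinks in proportion to the overall level.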

Also, although the circuit configuration of the dynamic presence filter shown in FIG. 4 provides a relatively simple and effective way of boosting presence band frequency components in response to a control signal proportional to the energy appearing in the presence band, ways of applying the control signal other than via the balanced light dependent resistors will now suggest themselves to those skilled in the art. Moreover, although this circuit has particular utility in the disclosed dynamic presence equalizer, its application is not limited thereto. For example, by suitable selection of the parameters of the parallel resonant circuit, other portions of the frequency band may be selectively boosted, or if a series resonant circuit is substituted for the parallel resonant circuit, the system would be operative to boost all frequencies except a given band. All such modifications are within the contemplation of the invention.

From the foregoing it is seen that the system automatically and continuously analyzes the relative amount of energy in the presence band, and if determined to be insufficient, the amplitude of the presence band components is boosted in compensation. Moreover, the automatic speech-music discriminator removes the control signal when music is present, the dynamic presence filter in this case functioning as a flat amplifier. Thus, the system functions, without operator attendance, on speech programming alone, or on musical programming without disturbing the quality of the musical signals. Because the amount of energy in the presence band is generally so much less than in the rest of the band, the boosting of the presence band components has no noticeable effect on the overall volume of the signal. In other words, it is possible with the system of the present invention to make speech signals more intelligible without changing modulation of the signal. The peak limiter 18 is not pushed into action by the amplified presence band components and, accordingly, the modulation remains about the same but with more effectiveness.

In short, the benefit derived from use of the system is brighter, clearer and, hence, more readily understandable speech signals. Although the listener can readily perceive the improvement in overall intelligibility by performing an A-B comparison test with the unit in and out of service, psychoacoustic tests to obtain a more objective measurement of the efficacy of the device have demonstrated a dramatic improvement in word intelligibility. An experiment was designed wherein standard phonetically balanced word lists were masked with USASI noise, with an overall signal-to-noise ratio of 8 db. The signal was switched either through or around the dynamic presence equalizer of the invention, masked, and then transmitted to a well-trained psychoacoustic test panel. The panel members wrote the word lists as they perceived them, and the percentage of correct answers was determined. The mean test score without the dynamic presence equalizer was 57 percent; however, when the equalizer was used, the mean test score improved to 72 percent. This rather dramatic improvement in word intelligibility was obtained without change in the program level or overall signal-to-noise ratio.

I claim:

1. An audio frequency signal processing circuit normally having a substantially flat amplification vs. frequency characteristic over a frequency range of interest and operative to provide greater amplification of frequencies in a selected portion of said range, said circuit comprising, an amplifier having first and second input terminals, and circuit means for applying input signals having frequencies in said range of interest to said first and second input terminals in push-pull relationship, said circuit means including respective first and second substantially resistive signal paths, respective third and fourth circuit paths in parallel with said first and second resistive paths, respectively, a resonant circuit connected between said third and fourth circuit paths resonant at the center of said selected portion of said range of interest, said third and fourth circuit paths further including control means to which a control signal having an amplitude related to a characteristic of signals having frequencies in said selected portion is applied, said control means being operative in response to variations in the magnitude of said control signal to vary the proportion of said input signal carried by said third and fourth circuit paths.
 2. A circuit according to claim 1 wherein said resonant circuit is a parallel resonant circuit.
 3. A circuit according to claim 1 wherein said control means includes first and second light dependent resistors connected in said third and fourth circuit paths, respectively, and a lamp positioned to illuminate said light dependent resistors and energized by said control signal. 